R, RStudio and Tidyverse Stack

The R Language

R is a scripting language and a very powerful tool for data analysis and presentation, primarily thanks to its huge user base and their dedication to developing free and open source libraries/packages covering a vast range of knowledge domains.

The Comprehensive R Archive Network (CRAN) is the canonical repository for R packages. Note that almost all* packages hosted on CRAN may be used within a Shiny app.

*Packages dependent on parallel or distributed computing are unlikely to be supported; contact shinyapps-support@rstudio.com with any questions.

Learning R

There are thousands of online resources for learning R, and many are available for free.

Two I’d like to personally endorse are:

The R Console

R is the name of both the programming language and the console in which many users of R write and evaluate their code.

To use R on your local machine you must download and install the R Console; it’s available for Windows, OS X and Linux.

Like all consoles, this application provides [only] the following functionality:

  • Write code and script files
  • Evaluate code and script files

RStudio

RStudio is a free, open-source IDE (integrated development environment) that provides an extremely powerful and friendly interface for developing with R.

IDEs make it easier to manage your programming, providing the following features:

RStudio

RStudio, however, provides much more exciting features on top of a standard IDE:

*more on RMarkdown after some actual code.

Base R and R Packages

When R is installed, the machinery necessary to run R code is added to your computer, along with a number of “base” packages including stats, utils and graphics.

See stackoverflow.com/a/9705725/1659890 for further details.
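As a quick check, the packages that ship with R can be listed from the console. This is a minimal sketch; the variable name base_pkgs is purely illustrative, and the exact list may vary slightly between R versions.

```r
# List the packages installed with "base" priority (these ship with R itself)
base_pkgs <- rownames(installed.packages(priority = "base"))
print(base_pkgs)

# The three packages named above should all be in that list
c("stats", "utils", "graphics") %in% base_pkgs
```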

These packages will not get you far in life, unless you’re prepared to write a lot of code from scratch.

But you can guarantee* that any code sample you see online that uses only “base R” will work without having to install additional packages.

Installing Packages

If a package is on CRAN then it is “installed” onto your computer using the following code. You’re advised to type this directly into the console and not into your documents:

install.packages("ggplot2")

Once a library is installed, its functions can be accessed with the package name and a double colon, for example ggplot2::geom_point().

However, libraries are designed to be used after being loaded:

library(ggplot2)
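The difference between the two access styles can be tried without installing anything, using the tools package, which ships with R but is not loaded by default. This is only a sketch; the choice of tools::toTitleCase() is an illustrative assumption, not something specific to ggplot2.

```r
# Namespace-qualified call: works even though tools has not been loaded
qualified <- tools::toTitleCase("hello world")

# After library(), the function can be called by its bare name
library(tools)
attached <- toTitleCase("hello world")

# Both styles call the same function
identical(qualified, attached)
```

The qualified form is handy for a one-off call, while library() saves typing when a package is used throughout a script.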

Warning on Packages

While packages are incredibly useful, it is important not to offload all thought/development to packages for three important reasons:

RStudio-backed Packages

  • The “tidyverse” is a collection of packages maintained by RStudio devs [particularly Hadley Wickham]
  • tidyverse packages play extremely nicely together
  • tidyverse packages are extremely useful for preparing data for interactive visualisations
  • tidyverse packages are highly optimised, often specifically around nitpicky details of base R (readr is a good example of this)
  • tidyverse is the backbone of the recently published, free online book R for Data Science

Tidyverse package workflow

  • Import with readr
  • Reshape with tidyr
  • Filter, modify and query with dplyr
  • Visualise with ggplot2 (but that’s not interactive…)
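The four steps above can be sketched end to end as follows. This assumes the tidyverse packages are installed (readr 2.0 or later for the inline I() input); the data and every variable name are made up for illustration.

```r
library(readr)
library(tidyr)
library(dplyr)
library(ggplot2)

# Import with readr: a tiny inline CSV stands in for a real file path
scores_wide <- read_csv(
  I("student,maths,physics\nana,70,65\nben,55,80\ncat,90,85"),
  show_col_types = FALSE
)

# Reshape with tidyr: one row per student/subject pair
scores_long <- pivot_longer(
  scores_wide,
  cols = c(maths, physics),
  names_to = "subject",
  values_to = "score"
)

# Filter, modify and query with dplyr: mean of the scores of 60 or above
subject_means <- scores_long %>%
  filter(score >= 60) %>%
  group_by(subject) %>%
  summarise(mean_score = mean(score))

# Visualise with ggplot2: a static bar chart of the summary
score_plot <- ggplot(subject_means, aes(x = subject, y = mean_score)) +
  geom_col()
```

Printing score_plot (or typing its name in the console) draws the chart.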

RStudio links to a fantastic cheatsheet on the tidyverse (the complicated reshaping/filtering part of it), available under Help > Cheatsheets.

Installing the tidyverse

There are currently over 15 packages in the tidyverse, and it’s a pain installing each of them separately, so RStudio have made everything easy to manage via the tidyverse package:

# install.packages("tidyverse")
library("tidyverse")
## Loading tidyverse: ggplot2
## Loading tidyverse: tibble
## Loading tidyverse: tidyr
## Loading tidyverse: readr
## Loading tidyverse: purrr
## Loading tidyverse: dplyr
## Conflicts with tidy packages ----------------------------------------------
## filter(): dplyr, stats
## lag():    dplyr, stats
tidyverse_packages()
##  [1] "broom"     "dplyr"     "forcats"   "ggplot2"   "haven"    
##  [6] "httr"      "hms"       "jsonlite"  "lubridate" "magrittr" 
## [11] "modelr"    "purrr"     "readr"     "readxl"    "stringr"  
## [16] "tibble"    "rvest"     "tidyr"     "xml2"      "tidyverse"
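The conflicts reported when loading mean that, once dplyr is loaded, filter() and lag() refer to the dplyr versions rather than the stats ones. A small sketch of how the :: operator keeps both reachable, assuming dplyr is installed and using the built-in mtcars data set (variable names are illustrative):

```r
library(dplyr)

# Plain filter() now means dplyr::filter(): keep the rows matching a condition
small_cars <- filter(mtcars, cyl == 4)

# The stats version (linear filtering of a series) needs its namespace prefix
moving_avg <- stats::filter(1:5, rep(1 / 3, 3))

nrow(small_cars)
moving_avg
```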

R Syntax Catch-up

During today’s session we’re going to use what some users consider advanced R programming, but even experienced R users often get tripped up over brackets. It’s good to cement in your head what each bracket is for, so that when you read code you know what’s going on:

Round brackets ( ): encapsulate the arguments for a function. In the case of rep("Hello World", 2) the round brackets encapsulate the two arguments passed to the function rep, and arguments are delimited by commas.

Square brackets [ ]: used for extracting parts (rows, columns, individual elements) from data structures - that’s their only use.

Braces { }: used for containing expressions. When writing mathematical expressions by hand, round brackets are usually used for controlling precedence (order of operations); R accepts round brackets for this too, but braces do the same job, as in 2*{x+1}^2.
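The three bracket types described above can be tried side by side in the console. This is a minimal sketch using only base R; the variable names and the value of x are illustrative.

```r
x <- 3

# Round brackets: wrap the comma-separated arguments of a function call
greetings <- rep("Hello World", 2)

# Square brackets: extract elements, rows or columns from a data structure
second_letter <- letters[2]
first_car_mpg <- mtcars[1, "mpg"]

# Braces: contain an expression; here they control precedence
with_braces <- 2 * {x + 1}^2
with_parens <- 2 * (x + 1)^2

# Both forms evaluate to the same number
with_braces == with_parens
```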

Braces are necessary where more than one thing is being done in an individual argument:

rep(
  "strings",
  {
    # Both statements run in order; the value of the last one (5) is the argument
    no1 <- 2
    no1 + 3
  }
)
## [1] "strings" "strings" "strings" "strings" "strings"